13 research outputs found

    Text Similarity Between Concepts Extracted from Source Code and Documentation

    Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. Traceability research has often helped in the past by recovering links between code and documentation when the two fell out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If the sets are vastly different, the difference might indicate a considerable ageing of the documentation, and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to a set of key terms, each containing the concepts of one of the systems sampled. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative powers, depending on the pre-training of the machine learning model. In particular, the SpaCy and the FastText embeddings offer up to 80% and 90% similarity scores. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy for one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
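    The set comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper computes cosine scores on pre-trained embeddings (SpaCy, FastText), whereas this sketch swaps in plain term-count vectors so that it runs without any model. The function names and the two term lists are hypothetical.

```python
import math
from collections import Counter


def jaccard_index(terms_a, terms_b):
    """Size of the intersection divided by size of the union of two term sets."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def cosine_similarity(terms_a, terms_b):
    """Cosine of the angle between two term-count vectors.

    Assumption: term counts stand in for the embedding vectors used in the paper.
    """
    counts_a, counts_b = Counter(terms_a), Counter(terms_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical key terms extracted from one system.
code_terms = ["parser", "token", "cache", "index"]
doc_terms = ["parser", "token", "install", "index"]

print(jaccard_index(code_terms, doc_terms))      # 3 shared terms out of 5 distinct
print(cosine_similarity(code_terms, doc_terms))  # dot product 3, both norms 2
```

    A low score on either measure would flag a system whose documentation may have drifted from its code, which is the signal the abstract describes.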

    Extracting and Comparing Concepts Emerging from Software Code, Documentation and Tests

    Traceability in software engineering is the ability to connect different artifacts that have been built or designed at various points in time. Given the variety of tasks, tools and formats in the software lifecycle, an outstanding challenge for traceability studies is to deal with the heterogeneity of the artifacts, the links between them and the means to extract each. Using a unified approach for extracting keywords from textual information, this paper aims to compare the concepts extracted from three software artifacts: source code, documentation and tests from the same system. The objectives are to detect similarities in the concepts that emerged, and to show the degree of alignment and synchronisation the artifacts possess. Using the components of three projects from the Apache Software Foundation, this paper extracts the concepts from 'base' source code, documentation, and tests (separated from the source code). The extraction is based on the keywords present in each artifact: we then run multiple comparisons (by calculating cosine similarities on features extracted by word embeddings) in order to detect how similar the sets of concepts are, or how much they overlap. For similarities between code and tests, we discovered that using pre-trained language models (with increasing dimension and corpus size) correlates with an increase in magnitude, with higher averages and smaller ranges. FastText pre-trained embeddings scored the highest average of 97.33% with the lowest range of 21.8 across all projects. Also, our approach was able to quickly detect outliers, possibly indicating drifts in traceability within modules. For similarities involving documentation, there was a considerable drop in similarity score compared to that between code and tests per module, down to below 5%.
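    As a rough illustration of the comparison in this abstract (cosine similarity on word-embedding features, summarised per project by average and range), the sketch below uses a tiny hand-made vector table in place of a pre-trained model. The vectors, module names, keyword lists and function names are all hypothetical, and averaging word vectors per artifact is an assumption rather than the paper's confirmed method.

```python
import math

# Hypothetical 3-dimensional "embeddings"; a real study would load
# pre-trained vectors (e.g. FastText) instead of this toy table.
TOY_VECTORS = {
    "queue": (1.0, 0.0, 0.0),
    "broker": (0.0, 1.0, 0.0),
    "message": (0.0, 0.0, 1.0),
}


def mean_vector(terms):
    """Average the vectors of all terms found in the toy table."""
    vectors = [TOY_VECTORS[t] for t in terms if t in TOY_VECTORS]
    if not vectors:
        raise ValueError("no known terms to embed")
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def summarize(scores):
    """Return (average, range) of a collection of similarity scores."""
    values = list(scores)
    return sum(values) / len(values), max(values) - min(values)


# Hypothetical keyword sets per module: (code keywords, test keywords).
modules = {
    "core": (["queue", "broker"], ["queue", "broker"]),
    "client": (["queue", "message"], ["queue"]),
}

scores = {
    name: cosine(mean_vector(code), mean_vector(tests))
    for name, (code, tests) in modules.items()
}
average, value_range = summarize(scores.values())
print(scores, average, value_range)
```

    A module whose score sits far below the average would be a candidate outlier of the kind the abstract mentions.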

    Artifact Traceability in DevOps: An Industrial Experience Report

    In DevOps, the traceability of software artifacts is critical to the successful development and operation of project delivery to stakeholders. Before the introduction of end-to-end traceability in DevOps at a Data Analytics team at bp (BP plc), an international integrated energy company, the tracing of artifacts throughout a project life cycle was manual and time-consuming. This changed when traceability became more automated, with an end-to-end traceability capability offered on the platform. This paper reports on the ways of working and the experience of developers implementing DevOps for developing and putting into production a JavaScript React web application, with a focus on traceability management of artifacts produced throughout the life cycle. This report highlights key opportunities and challenges in traceability management from the development stage to production.

    Fundamentals of Entrepreneurship ENT300 : Merzazusa Car Rental / Muhammad Zaki Pauz... [et al.]

    The Merzazusa car rental management team will be led by the General Manager, who is responsible for preparing the business plan for the agreed business, leading the other managers, organizing the tasks assigned to them, and controlling how those tasks are carried out. The marketing manager, on the other hand, will have the biggest responsibility for preparing the business marketing plan, which includes identifying the target market, determining the market size, identifying the market competitors, determining the market share, developing the sales forecast and, finally, defining the marketing strategies. The operating manager will control and monitor the business operating hours and note the equipment and facilities that Merzazusa car rental will require. The Administration Manager is responsible for office administration, which involves recruiting workers and the office administration that will be established soon. Last but not least, the finance manager will be the most important person in our management, as he will manage the finances of the Merzazusa car rental company, from the budget he distributes to the final financial statement, which will show whether or not the business is a success. Merzazusa will give us the chance to make a profit. It is normal for a company not to make a profit in the early stage of business. Our business should go ahead because we have a strategic location near educational institutions, and our target customers are students. We believe that our business can be developed

    A Rare Tumor in the Neck of a Child: Plexiform Neurofibroma

    Plexiform neurofibroma represents an uncommon variant of neurofibromatosis type 1, constituting only 5%–30% of all cases. Plexiform neurofibroma is usually diagnosed during childhood and arises from multiple nerves, manifesting as bulging and deforming masses that can also involve connective tissue and skin folds. We report a case of a two-year-old girl who presented with worsening stridor since birth and later exhibited progressively increasing left neck swelling at the age of 10 months. Ultrasound and magnetic resonance imaging (MRI) showed a lobulated solid mass in the left deep neck space extending to the midline and having a mass effect on the airway, with involvement of the supraglottic region. Tracheostomy was done, and a biopsy of the supraglottic lesion revealed a plexiform neurofibroma. The patient was conservatively managed after a discussion with her parents concerning the potential for operative morbidity. The patient's parents learned about tracheostomy care, and the patient was scheduled for yearly MRI surveillance. MRI was performed again three months after the initial diagnosis and showed a stable lesion. Plexiform neurofibroma is a slow-growing tumor. A treatment decision must weigh the benefits of surgery against the morbidity of the progressing disease. Hence, airway management is crucial prior to the final decision in such cases.

    Applications of natural language processing in software traceability: A systematic mapping study

    A key part of software evolution and maintenance is the continuous integration of collaborative efforts, often resulting in complex traceability challenges between software artifacts: features and modules remain scattered in the source code, and traceability links become harder to recover. In this paper, we perform a systematic mapping study dealing with recent research on recovering these links through information retrieval, with a particular focus on natural language processing (NLP). Our search strategy gathered a total of 96 papers as the focus of our study, covering a period from 2013 to 2021. We conducted trend analysis on the NLP techniques and tools involved, and on traceability efforts (applying NLP) across the software development life cycle (SDLC). Based on our study, we have identified the following key issues, barriers, and setbacks: syntax convention, configuration, translation, explainability, properties representation, tacit knowledge dependency, scalability, and data availability. Based on these, we consolidated the following open challenges: representation similarity across artifacts, the effectiveness of NLP for traceability, and achieving scalable, adaptive, and explainable models. To address these challenges, we recommend a holistic framework for NLP solutions to achieve effective traceability, and efforts towards interoperability and explainability in NLP models for traceability.

    From Descriptive to Predictive: Forecasting Emerging Research Areas in Software Traceability Using NLP from Systematic Studies

    Systematic literature reviews (SLRs) and systematic mapping studies (SMSs) are common studies in any discipline to describe and classify past works, and to inform a research field of potential new areas of investigation. This last task is typically achieved by observing gaps in past works, and hinting at the possibility of future research in those gaps. Using an NLP-driven methodology, this paper proposes a meta-analysis to extend current systematic methodologies of literature reviews and mapping studies. Our work leverages a Word2Vec model, pre-trained in the software engineering domain, combined with a time series analysis. Our aim is to forecast future trajectories of research outlined in systematic studies, rather than just describing them. Using the same dataset from our own previous mapping study, we were able to go beyond descriptively analysing the data that we gathered or merely 'guessing' future directions. In this paper, we show how recent advancements in the field of our SMS, and the use of time series, enabled us to forecast future trends in the same field. Our proposed methodology sets a precedent for exploring the potential of language models coupled with time series in the context of systematically reviewing the literature.